Sharper Bounds for Regularized Data Fitting
Authors
Abstract
We study matrix sketching methods for regularized variants of linear regression, low-rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques that preserve the objective function value for regularized problems, an area that has remained largely unexplored. We study regularization both in a fairly broad setting and in the specific context of the widely used technique of ridge regularization; for the latter, as applied to each of these problems, we show algorithmic resource bounds in which the statistical dimension appears in places where the rank would appear in previous bounds. The statistical dimension is never larger than the rank, and it decreases as the amount of regularization increases. In particular, for the ridge low-rank approximation problem $\min_{Y,X} \|YX - A\|_F^2 + \lambda\|Y\|_F^2 + \lambda\|X\|_F^2$, where $Y \in \mathbb{R}^{n \times k}$ and $X \in \mathbb{R}^{k \times d}$, we give an approximation algorithm needing $O(\mathrm{nnz}(A)) + \tilde{O}((n + d)\varepsilon^{-1} k \min\{k, \varepsilon^{-1}\mathrm{sd}_\lambda(Y^*)\}) + \mathrm{poly}(\mathrm{sd}_\lambda(Y^*)\varepsilon^{-1})$ time, where $\mathrm{sd}_\lambda(Y^*) \le k$ is the statistical dimension of $Y^*$, $Y^*$ is an optimal $Y$, $\varepsilon$ is an error parameter, and $\mathrm{nnz}(A)$ is the number of nonzero entries of $A$. This is faster than prior work, even when $\lambda = 0$. We also study regularization in a much more general setting. For example, we obtain sketching-based algorithms for the low-rank approximation problem $\min_{X,Y} \|YX - A\|_F^2 + f(Y, X)$, where $f(\cdot, \cdot)$ is a regularizing function satisfying some very general conditions (chiefly, invariance under orthogonal transformations).
1998 ACM Subject Classification: G.1.3 Numerical Linear Algebra
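As a rough illustration of the statistical dimension that the abstract refers to, the sketch below assumes the standard definition $\mathrm{sd}_\lambda(A) = \sum_i 1/(1 + \lambda/\sigma_i^2)$ over the nonzero singular values $\sigma_i$ of $A$; the abstract itself does not state this formula. The function name `statistical_dimension`, the test matrix, and the values of `lam` are hypothetical choices made only for this example, and the code is not the paper's algorithm.

```python
import numpy as np

def statistical_dimension(A, lam):
    """Assumed definition: sum over nonzero singular values of
    sigma_i^2 / (sigma_i^2 + lam), i.e. 1 / (1 + lam / sigma_i^2)."""
    s = np.linalg.svd(A, compute_uv=False)
    # Discard numerically zero singular values, using the same kind of
    # tolerance that numpy.linalg.matrix_rank applies by default.
    tol = s.max() * max(A.shape) * np.finfo(s.dtype).eps
    s2 = s[s > tol] ** 2
    return float(np.sum(s2 / (s2 + lam)))

# Hypothetical test matrix: a 50 x 20 matrix of rank 8.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8)) @ rng.standard_normal((8, 20))
rank = int(np.linalg.matrix_rank(A))

# Statistical dimension for increasing amounts of regularization.
lams = (0.0, 1.0, 10.0, 100.0)
values = [statistical_dimension(A, lam) for lam in lams]

for lam, v in zip(lams, values):
    print(f"lambda = {lam:6.1f}   sd = {v:.3f}   rank = {rank}")
```

Under this definition the value equals the rank at $\lambda = 0$ and shrinks as $\lambda$ grows, which is why bounds stated in terms of the statistical dimension can be tighter than rank-based ones.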
Similar resources
Sharper Bounds for Regression and Low-Rank Approximation with Regularization
We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of th...
Spherical harmonics-based parametric deconvolution of 3D surface images using bending energy minimization
Numerical deconvolution of 3D fluorescence microscopy data yields sharper images by reversing the known optical aberrations introduced during the acquisition process. When additional prior information such as the topology and smoothness of the imaged object surface is available, the deconvolution can be performed by fitting a parametric surface directly to the image data. In this work, we incor...
The Structure of Bhattacharyya Matrix in Natural Exponential Family and Its Role in Approximating the Variance of a Statistics
In most situations the best estimator of a function of the parameter exists, but sometimes it has a complex form and we cannot compute its variance explicitly. Therefore, a lower bound for the variance of an estimator is one of the fundamentals in the estimation theory, because it gives us an idea about the accuracy of an estimator. It is well-known in statistical inference that the Cramér...
Maximum Margin Multiclass Nearest Neighbors
We develop a general framework for margin-based multicategory classification in metric spaces. The basic work-horse is a margin-regularized version of the nearest-neighbor classifier. We prove generalization bounds that match the state of the art in sample size n and significantly improve the dependence on the number of classes k. Our point of departure is a nearly Bayes-optimal finite-sample r...
A primal-dual semi-smooth Newton method for nonlinear L1 data fitting problems
This work is concerned with L1 data fitting for nonlinear inverse problems. This formulation is advantageous if the data is corrupted by impulsive noise. However, the problem is not differentiable and lacks local uniqueness, which makes its efficient solution challenging. By considering a regularized primal-dual formulation of this problem, local uniqueness can be shown under a second order su...